1 Introduction

Let the random variables $\{X_1, X_2, X_3, X_4\}$ possess a multinomial
distribution $\mathcal{MN}(n, p_1, p_2, p_3, p_4)$ (see [1], [2]). Just to fix
notation, in this case the joint probability function is

\[
P\{X_1 = x_1, \ldots, X_4 = x_4\} = P(x_1, \ldots, x_4)
  = n! \prod_{i=1}^{4} \frac{p_i^{x_i}}{x_i!},
\]

where $\sum_{i=1}^{4} p_i = 1$, the $x_i$ are integers $\geq 0$ and
$\sum_{i=1}^{4} x_i = n$.

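Now let $\{i_j\}$, $j = 1, \ldots, 4$, define a permutation of the numbers
$\{1, 2, 3, 4\}$ and define $W = X_{i_1} + X_{i_2}$. Then

\[
\begin{aligned}
P\{W = k,\, X_{i_3} = x_{i_3},\, X_{i_4} = n - k - x_{i_3}\}
 &= \sum_{i=0}^{k} P\{X_{i_1} = i,\, X_{i_2} = k - i,\, X_{i_3} = x_{i_3},\, X_{i_4} = n - k - x_{i_3}\} \\
 &= \sum_{i=0}^{k} \frac{n!}{i!\,(k-i)!\,x_{i_3}!\,(n-k-x_{i_3})!}\,
    p_{i_1}^{i}\, p_{i_2}^{k-i}\, p_{i_3}^{x_{i_3}}\, p_{i_4}^{n-k-x_{i_3}} \\
 &= \frac{n!}{k!\,x_{i_3}!\,(n-k-x_{i_3})!}\,
    p_{i_3}^{x_{i_3}}\, p_{i_4}^{n-k-x_{i_3}}\, (p_{i_1} + p_{i_2})^{k}, \qquad (1)
\end{aligned}
\]

because of the binomial theorem.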
It follows that $\{W, X_{i_3}, X_{i_4}\} \sim \mathcal{MN}(n, p_{i_1} + p_{i_2}, p_{i_3}, p_{i_4})$.
Because the marginals in a multinomial distribution are binomial variables
with the appropriate parameters, we can conclude

\[
W = X_{i_1} + X_{i_2} \sim \mathcal{B}(n, p_{i_1} + p_{i_2}). \qquad (2)
\]

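Using this result define

\[
\begin{cases}
Y_1 = X_1 + X_3, \\
Y_2 = X_2 + X_3.
\end{cases}
\]

Then $Y_1 \sim \mathcal{B}(n, p_1 + p_3)$ and $Y_2 \sim \mathcal{B}(n, p_2 + p_3)$.
Because in a multinomial distribution $\mathrm{Cov}(X_i, X_j) = -n p_i p_j$ for
$i \neq j$ and $\mathrm{Var}(X_3) = n p_3 (1 - p_3)$, we have

\[
\begin{aligned}
\mathrm{Cov}(Y_1, Y_2)
 &= \mathrm{Cov}(X_1, X_2) + \mathrm{Cov}(X_1, X_3) + \mathrm{Cov}(X_3, X_2) + \mathrm{Var}(X_3) \\
 &= -n p_1 p_2 - n p_1 p_3 - n p_2 p_3 + n p_3 (1 - p_3) \\
 &= -n p_1 p_2 + n p_3 (1 - p_1 - p_2 - p_3) \\
 &= -n p_1 p_2 + n p_3 p_4 \\
 &= n (p_3 p_4 - p_1 p_2),
\end{aligned}
\]

so we have a positive covariance if $p_3 p_4 - p_1 p_2 > 0$.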
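The covariance formula can be checked by brute-force enumeration of the multinomial support. The following is a minimal sketch, not part of the original derivation; the helper names `multinomial_pmf`, `support` and `exact_cov_y1_y2`, as well as the example values $n = 6$ and $p = (0.1, 0.2, 0.3, 0.4)$, are illustrative assumptions.

```python
from itertools import product
from math import factorial


def multinomial_pmf(x, p):
    """P{X_1 = x_1, ..., X_4 = x_4} for counts x and cell probabilities p."""
    coef = factorial(sum(x))
    prob = 1.0
    for xi, pi in zip(x, p):
        coef //= factorial(xi)  # stays an integer at every step
        prob *= pi ** xi
    return coef * prob


def support(n):
    """All (x1, x2, x3, x4) of non-negative integers summing to n."""
    for x1, x2, x3 in product(range(n + 1), repeat=3):
        if x1 + x2 + x3 <= n:
            yield (x1, x2, x3, n - x1 - x2 - x3)


def exact_cov_y1_y2(n, p):
    """Cov(Y1, Y2) with Y1 = X1 + X3 and Y2 = X2 + X3, by enumeration."""
    e1 = e2 = e12 = 0.0
    for x in support(n):
        w = multinomial_pmf(x, p)
        y1, y2 = x[0] + x[2], x[1] + x[2]
        e1 += w * y1
        e2 += w * y2
        e12 += w * y1 * y2
    return e12 - e1 * e2


n, p = 6, (0.1, 0.2, 0.3, 0.4)           # assumed example values
cov = exact_cov_y1_y2(n, p)
closed = n * (p[2] * p[3] - p[0] * p[1])  # n (p3 p4 - p1 p2)
```

Enumeration is only practical for small $n$, but it avoids any sampling noise in the comparison between `cov` and `closed`.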

Finally, the linear correlation coefficient between $Y_1$ and $Y_2$ is given by

\[
\rho = \frac{p_3 p_4 - p_1 p_2}
            {\sqrt{(p_1 + p_3)(1 - p_1 - p_3)(p_2 + p_3)(1 - p_2 - p_3)}}.
\]
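Since the factor $n$ appears in both the covariance and the two binomial standard deviations, it cancels in the correlation coefficient. A small sketch of the formula follows; the function name `correlation_y1_y2` and the example probabilities are assumptions made for illustration only.

```python
import math


def correlation_y1_y2(p1, p2, p3, p4):
    """Linear correlation between Y1 = X1 + X3 and Y2 = X2 + X3 (n cancels)."""
    num = p3 * p4 - p1 * p2
    den = math.sqrt((p1 + p3) * (1 - p1 - p3) * (p2 + p3) * (1 - p2 - p3))
    return num / den


# Assumed example: p = (0.1, 0.2, 0.3, 0.4), so p3*p4 - p1*p2 = 0.1 > 0.
rho = correlation_y1_y2(0.1, 0.2, 0.3, 0.4)
```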

Now suppose we want to sample from two given binomial variables with a
given positive correlation coefficient $r$. Let the given variables be
$Y_1 \sim \mathcal{B}(n, \pi_1)$ and $Y_2 \sim \mathcal{B}(n, \pi_2)$, so that the
parameters $\pi_1$, $\pi_2$, $r$ are assumed given. We will use the previous
framework, determining the parameters $p_1$, $p_2$, $p_3$ as functions of the
new parameters. We set

\[
p_1 + p_3 = \pi_1 \quad \text{and} \quad p_2 + p_3 = \pi_2.
\]

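Then

\[
r = \frac{p_3 p_4 - p_1 p_2}{\sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}},
\]

so that

\[
p_3 p_4 - p_1 p_2 = r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}.
\]

Inserting here $p_4 = 1 - p_1 - p_2 - p_3$, $p_1 = \pi_1 - p_3$ and
$p_2 = \pi_2 - p_3$, the left-hand side reduces to $p_3 - \pi_1 \pi_2$ and we get

\[
p_3 = \pi_1 \pi_2 + r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}.
\]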
Then

\[
p_1 = \pi_1 - \pi_1 \pi_2 - r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}, \qquad (3)
\]

\[
p_2 = \pi_2 - \pi_1 \pi_2 - r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}. \qquad (4)
\]

The only problem is that we have to make sure that both $p_1$ and $p_2$ turn
out to be positive. So we have to impose

\[
\pi_1 (1 - \pi_2) > r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}, \qquad (5)
\]

and

\[
\pi_2 (1 - \pi_1) > r \sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}. \qquad (6)
\]
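Equations (3) to (6) translate directly into a sampling recipe: compute the four cell probabilities from $(\pi_1, \pi_2, r)$, draw one multinomial vector, and add the cells. The sketch below uses only the Python standard library; the names `cell_probabilities` and `sample_pair`, the seed, and the example parameters are assumptions for illustration, and the check rejects only negative probabilities (a zero cell is tolerated here, which is slightly weaker than the strict inequalities above).

```python
import math
import random


def cell_probabilities(pi1, pi2, r):
    """Return (p1, p2, p3, p4) giving marginals pi1, pi2 and correlation r."""
    s = r * math.sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    p3 = pi1 * pi2 + s
    p1 = pi1 - p3          # Equation (3)
    p2 = pi2 - p3          # Equation (4)
    p4 = 1.0 - p1 - p2 - p3
    if min(p1, p2, p3, p4) < 0:
        raise ValueError("r is not attainable for these pi1, pi2")
    return p1, p2, p3, p4


def sample_pair(n, probs, rng):
    """Draw (Y1, Y2) = (X1 + X3, X2 + X3) from one multinomial vector."""
    cells = rng.choices(range(4), weights=probs, k=n)  # n categorical draws
    x1, x2, x3 = (cells.count(j) for j in range(3))
    return x1 + x3, x2 + x3


rng = random.Random(12345)                   # assumed seed, for repeatability
probs = cell_probabilities(0.3, 0.5, 0.4)    # assumed example parameters
pairs = [sample_pair(20, probs, rng) for _ in range(5)]
```

Drawing the multinomial vector as $n$ categorical draws keeps the sketch dependency-free; a vectorised multinomial sampler from a numerical library would be the natural replacement for large $n$.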

This imposes a restriction on the attainable upper bound for $r$. Indeed it
has to be

\[
r < \min\left(
  \frac{\pi_1 (1 - \pi_2)}{\sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}},\;
  \frac{\pi_2 (1 - \pi_1)}{\sqrt{\pi_1 (1 - \pi_1)\, \pi_2 (1 - \pi_2)}}
\right).
\]
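Write $\pi_1 = \beta \pi_2$, $\beta > 0$. Then the above inequality becomes

\[
r < \min\left(
  \sqrt{\frac{\beta (1 - \pi_2)}{1 - \beta \pi_2}},\;
  \sqrt{\frac{1 - \beta \pi_2}{\beta (1 - \pi_2)}}
\right).
\]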



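If $\beta > 1$ the minimum is

\[
\sqrt{\frac{1 - \beta \pi_2}{\beta (1 - \pi_2)}}.
\]

If $\beta < 1$ the minimum is

\[
\sqrt{\frac{\beta (1 - \pi_2)}{1 - \beta \pi_2}}.
\]

If $\beta = 1$ the minimum is 1. So when $Y_1$ and $Y_2$ are identically
distributed there is no constraint to impose on $r$.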
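The bound is easy to evaluate numerically. The following sketch writes it directly in terms of $\pi_1$ and $\pi_2$, using the fact that the two candidate values are reciprocals of each other; the function name `r_upper_bound` is an assumption introduced here.

```python
import math


def r_upper_bound(pi1, pi2):
    """Upper bound on r so that p1 and p2 in Equations (3)-(4) stay positive."""
    a = pi1 * (1 - pi2)
    b = pi2 * (1 - pi1)
    # min(a, b) / sqrt(a * b) simplifies to sqrt(min(a, b) / max(a, b)).
    return math.sqrt(min(a, b) / max(a, b))


bound_01_02 = r_upper_bound(0.1, 0.2)   # pi1 = 0.1, pi2 = 0.2
bound_equal = r_upper_bound(0.5, 0.5)   # identically distributed case
```

For $\pi_1 = 0.1$, $\pi_2 = 0.2$ this gives $\sqrt{0.08 / 0.18} \approx 0.667$, and for equal marginals it gives 1.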

Now, when $\beta > 1$, the minimum equals $\sqrt{f(\beta)}$ with

\[
f(\beta) = \frac{1 - \beta \pi_2}{\beta (1 - \pi_2)},
\]

and taking the derivative with respect to $\beta$ we have

\[
f'(\beta) = -\frac{1}{\beta^2 (1 - \pi_2)},
\]

so that the derivative of the minimum is negative: the minimum is a
decreasing function of $\beta$.

When $\beta < 1$, the minimum equals $\sqrt{f(\beta)}$ with

\[
f(\beta) = \frac{\beta (1 - \pi_2)}{1 - \beta \pi_2},
\]

and taking the derivative with respect to $\beta$ we have

\[
f'(\beta) = \frac{1 - \pi_2}{(1 - \beta \pi_2)^2},
\]

so that the derivative of the minimum is positive: the minimum is an
increasing function of $\beta$.
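The monotonicity of the bound in $\beta$ can also be observed numerically. This is a rough check on a grid, not a proof; the function name `bound`, the value $\pi_2 = 0.3$ and the grid of $\beta$ values are assumptions chosen so that $\beta \pi_2 < 1$ throughout.

```python
import math


def bound(beta, pi2):
    """Upper bound on r when pi1 = beta * pi2 (requires beta * pi2 < 1)."""
    ratio = beta * (1 - pi2) / (1 - beta * pi2)
    return math.sqrt(min(ratio, 1 / ratio))


pi2 = 0.3                                              # assumed example value
below = [bound(b / 10, pi2) for b in range(1, 11)]     # beta = 0.1, ..., 1.0
above = [bound(b / 10, pi2) for b in range(10, 31)]    # beta = 1.0, ..., 3.0

increasing = all(x < y for x, y in zip(below, below[1:]))
decreasing = all(x > y for x, y in zip(above, above[1:]))
```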

In Table 1 we present the upper bound of $r$ for some pairs of values of
$\pi_1$ and $\pi_2$. Let us note that this upper bound is the same if we
interchange $\pi_1$ with $\pi_2$, and if we substitute $\pi_1$ with $1 - \pi_1$
and $\pi_2$ with $1 - \pi_2$.

2 Regression Function 

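Let us recall two consequences of the binomial theorem:

\[
\sum_{i=0}^{n} i \binom{n}{i} a^{i} b^{n-i} = n a (a + b)^{n-1}, \qquad (7)
\]

\[
\sum_{i=0}^{n} i^{2} \binom{n}{i} a^{i} b^{n-i}
  = n a (a + b)^{n-2} \left[ a + b + (n - 1) a \right]. \qquad (8)
\]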
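Both identities hold for arbitrary real $a$ and $b$, not only for probabilities, which is what allows them to be applied with $a + b \neq 1$ later on. A short numerical check follows; the function names and the test values $n = 7$, $a = 0.3$, $b = 1.1$ are assumptions for illustration.

```python
from math import comb, isclose


def first_moment_sum(n, a, b):
    """Left-hand side of Equation (7)."""
    return sum(i * comb(n, i) * a**i * b**(n - i) for i in range(n + 1))


def second_moment_sum(n, a, b):
    """Left-hand side of Equation (8)."""
    return sum(i * i * comb(n, i) * a**i * b**(n - i) for i in range(n + 1))


n, a, b = 7, 0.3, 1.1   # assumed values; note a + b != 1
ok7 = isclose(first_moment_sum(n, a, b), n * a * (a + b) ** (n - 1))
ok8 = isclose(
    second_moment_sum(n, a, b),
    n * a * (a + b) ** (n - 2) * (a + b + (n - 1) * a),
)
```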



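Because of Equation (1) the joint probability function of $X_1$ and
$X_2 + X_3$ is given by

\[
P\{X_1 = k,\, X_2 + X_3 = h,\, X_4 = n - k - h\}
  = \frac{n!}{k!\, h!\, (n - k - h)!}\, p_1^{k}\, p_4^{n-k-h}\, (p_2 + p_3)^{h}.
\]

Because of Equation (2) the conditional distribution of $X_1$ given
$X_2 + X_3$ is

\[
P\{X_1 = k \mid X_2 + X_3 = h\}
  = \binom{n-h}{k} \frac{p_1^{k}\, p_4^{n-h-k}}{(1 - p_2 - p_3)^{n-h}}.
\]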

Then the conditional expectation of $X_1$ given $X_2 + X_3 = h$ is given by,
noticing that the above conditioning event implies $0 \leq X_1 \leq n - h$,

\[
\begin{aligned}
E(X_1 \mid X_2 + X_3 = h)
 &= \frac{1}{(1 - p_2 - p_3)^{n-h}} \sum_{k=0}^{n-h} k \binom{n-h}{k} p_1^{k}\, p_4^{n-h-k} \\
 &= \frac{1}{(1 - p_2 - p_3)^{n-h}}\, (n - h)\, p_1\, (p_1 + p_4)^{n-h-1},
\end{aligned}
\]

where we used Equation (7). Now, since $p_1 + p_4 = 1 - p_2 - p_3$, we get

\[
E(X_1 \mid X_2 + X_3 = h) = \frac{p_1}{1 - p_2 - p_3}\, (n - h).
\]

Finally the conditional expectation of $X_1$ given $X_2 + X_3$ is

\[
E(X_1 \mid X_2 + X_3) = \frac{p_1}{1 - p_2 - p_3}\, (n - X_2 - X_3). \qquad (9)
\]

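Along the same lines we now evaluate the joint probability function of $X_3$
and $X_2 + X_3$. We obtain

\[
\begin{aligned}
P\{X_3 = h,\, X_2 + X_3 = k\}
 &= \sum_{i=0}^{n-k} P\{X_1 = i,\, X_2 = k - h,\, X_3 = h,\, X_4 = n - k - i\} \\
 &= \sum_{i=0}^{n-k} \frac{n!}{i!\,(k-h)!\,h!\,(n-k-i)!}\,
    p_1^{i}\, p_2^{k-h}\, p_3^{h}\, p_4^{n-k-i} \\
 &= \frac{n!\, p_2^{k-h}\, p_3^{h}}{(k-h)!\, h!}
    \sum_{i=0}^{n-k} \frac{1}{i!\,(n-k-i)!}\, p_1^{i}\, p_4^{n-k-i} \\
 &= \frac{k!\, n!\, p_2^{k-h}\, p_3^{h}}{k!\,(k-h)!\, h!\,(n-k)!}
    \sum_{i=0}^{n-k} \frac{(n-k)!}{i!\,(n-k-i)!}\, p_1^{i}\, p_4^{n-k-i} \\
 &= \binom{k}{h} \binom{n}{k}\, p_2^{k-h}\, p_3^{h}\, (p_1 + p_4)^{n-k}.
\end{aligned}
\]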



Then the conditional distribution of $X_3$ given $X_2 + X_3$ is

\[
P\{X_3 = h \mid X_2 + X_3 = k\}
  = \binom{k}{h} \frac{p_2^{k-h}\, p_3^{h}}{(p_2 + p_3)^{k}}.
\]

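Hence the conditional expectation of $X_3$ given $X_2 + X_3 = k$ is given by,
noticing that the above conditioning event implies $0 \leq X_3 \leq k$,

\[
\begin{aligned}
E(X_3 \mid X_2 + X_3 = k)
 &= \frac{1}{(p_2 + p_3)^{k}} \sum_{h=0}^{k} h \binom{k}{h} p_3^{h}\, p_2^{k-h} \\
 &= \frac{p_2^{k}}{(p_2 + p_3)^{k}}\, k\, \frac{p_3}{p_2}
    \left( 1 + \frac{p_3}{p_2} \right)^{k-1} \\
 &= \frac{k\, p_3}{p_2 + p_3},
\end{aligned}
\]

where we used again Equation (7). Finally the conditional expectation of
$X_3$ given $X_2 + X_3$ is

\[
E(X_3 \mid X_2 + X_3) = \frac{p_3}{p_2 + p_3}\, (X_2 + X_3). \qquad (10)
\]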

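It follows that

\[
\begin{aligned}
E(Y_1 \mid Y_2) &= E(X_1 + X_3 \mid X_2 + X_3) \\
 &= E(X_1 \mid X_2 + X_3) + E(X_3 \mid X_2 + X_3) \\
 &= \frac{p_1}{1 - p_2 - p_3}\, (n - X_2 - X_3) + \frac{p_3}{p_2 + p_3}\, (X_2 + X_3) \\
 &= \alpha n + (\beta - \alpha) Y_2, \qquad (11)
\end{aligned}
\]

where we wrote $\alpha = \dfrac{p_1}{1 - p_2 - p_3}$ and
$\beta = \dfrac{p_3}{p_2 + p_3}$. We can conclude that the regression function
is linear.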
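The linear form of Equation (11) can be confirmed for small $n$ by computing $E(Y_1 \mid Y_2 = k)$ exactly from the multinomial probability function. The sketch below is illustrative only; the helper names and the example values $n = 5$, $p = (0.1, 0.2, 0.3, 0.4)$ are assumptions.

```python
from itertools import product
from math import factorial


def multinomial_pmf(x, p):
    """P{X_1 = x_1, ..., X_4 = x_4} for counts x and cell probabilities p."""
    coef = factorial(sum(x))
    prob = 1.0
    for xi, pi in zip(x, p):
        coef //= factorial(xi)
        prob *= pi ** xi
    return coef * prob


def conditional_mean_y1(n, p, k):
    """Exact E(Y1 | Y2 = k), with Y1 = X1 + X3 and Y2 = X2 + X3."""
    num = den = 0.0
    for x1, x2, x3 in product(range(n + 1), repeat=3):
        x4 = n - x1 - x2 - x3
        if x4 < 0 or x2 + x3 != k:
            continue
        w = multinomial_pmf((x1, x2, x3, x4), p)
        num += w * (x1 + x3)
        den += w
    return num / den


n, p = 5, (0.1, 0.2, 0.3, 0.4)        # assumed example values
alpha = p[0] / (1 - p[1] - p[2])
beta = p[2] / (p[1] + p[2])
gaps = [
    abs(conditional_mean_y1(n, p, k) - (alpha * n + (beta - alpha) * k))
    for k in range(n + 1)
]
```

With these values $\alpha = 0.2$ and $\beta = 0.6$, so the regression line has intercept $\alpha n = 1$ and slope $\beta - \alpha = 0.4$.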

3 Conditional Variance 

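Since

\[
P\{X_1 = h,\, X_2 + X_3 = k,\, X_3 = r,\, X_4 = n - k - h\}
  = \frac{n!}{h!\,(k - r)!\, r!\,(n - k - h)!}\,
    p_1^{h}\, p_2^{k-r}\, p_3^{r}\, p_4^{n-k-h},
\]

the joint conditional probability function of $X_1$ and $X_3$ given
$X_2 + X_3$ is obtained as

\[
P\{X_1 = h,\, X_3 = r \mid X_2 + X_3 = k\}
  = \binom{n-k}{h} \binom{k}{r}
    \frac{p_1^{h}\, p_4^{n-k-h}\, p_2^{k-r}\, p_3^{r}}
         {(p_2 + p_3)^{k}\, (1 - p_2 - p_3)^{n-k}}.
\]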



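Since

\[
P\{X_1 = h \mid X_2 + X_3 = k\}
  = \binom{n-k}{h} \frac{p_1^{h}\, p_4^{n-k-h}}{(1 - p_2 - p_3)^{n-k}}
\]

and

\[
P\{X_3 = r \mid X_2 + X_3 = k\}
  = \binom{k}{r} \frac{p_2^{k-r}\, p_3^{r}}{(p_2 + p_3)^{k}},
\]

we see that $X_1$ and $X_3$ are conditionally independent given $X_2 + X_3$.
It follows that

\[
\mathrm{Var}(X_1 + X_3 \mid X_2 + X_3)
  = \mathrm{Var}(X_1 \mid X_2 + X_3) + \mathrm{Var}(X_3 \mid X_2 + X_3).
\]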
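We obtain

\[
\begin{aligned}
E(X_1^{2} \mid X_2 + X_3 = h)
 &= \frac{1}{(1 - p_2 - p_3)^{n-h}} \sum_{k=0}^{n-h} k^{2} \binom{n-h}{k} p_1^{k}\, p_4^{n-k-h} \\
 &= \frac{1}{(1 - p_2 - p_3)^{n-h}}\, (n - h)\, p_1\, (p_1 + p_4)^{n-h-2}
    \times \left[ p_1 + p_4 + (n - h - 1) p_1 \right],
\end{aligned}
\]

where we used Equation (8) and the fact that $1 - p_2 - p_3 = p_1 + p_4$. Thus

\[
\mathrm{Var}(X_1 \mid X_2 + X_3 = h) = \frac{p_1 p_4}{(p_1 + p_4)^{2}}\, (n - h),
\]

that is

\[
\mathrm{Var}(X_1 \mid Y_2) = \frac{p_1 p_4}{(p_1 + p_4)^{2}}\, (n - Y_2).
\]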

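Analogously

\[
\begin{aligned}
E(X_3^{2} \mid X_2 + X_3 = h)
 &= \frac{1}{(p_2 + p_3)^{h}} \sum_{r=0}^{h} r^{2} \binom{h}{r} p_3^{r}\, p_2^{h-r} \\
 &= \frac{p_2^{h}}{(p_2 + p_3)^{h}}\, h\, \frac{p_3}{p_2}
    \left( 1 + \frac{p_3}{p_2} \right)^{h-2}
    \times \left[ 1 + \frac{p_3}{p_2} + (h - 1) \frac{p_3}{p_2} \right] \\
 &= \frac{h\, p_3}{(p_2 + p_3)^{2}} \left[ p_2 + p_3 + (h - 1) p_3 \right],
\end{aligned}
\]

where we used Equation (8). It follows that

\[
\mathrm{Var}(X_3 \mid X_2 + X_3 = h) = \frac{h\, p_2 p_3}{(p_2 + p_3)^{2}},
\]

that is

\[
\mathrm{Var}(X_3 \mid Y_2) = \frac{p_2 p_3}{(p_2 + p_3)^{2}}\, Y_2.
\]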

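Finally

\[
\mathrm{Var}(Y_1 \mid Y_2) = \gamma + \delta Y_2,
\]

where we set $\gamma = \dfrac{n\, p_1 p_4}{(p_1 + p_4)^{2}}$ and
$\delta = \dfrac{p_2 p_3}{(p_2 + p_3)^{2}} - \dfrac{p_1 p_4}{(p_1 + p_4)^{2}}$.
We can conclude that the regression function is linear but not homoscedastic.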
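The conditional variance formula can be checked in the same brute-force way as the regression function. The sketch below is an illustration under assumed names and example values ($n = 5$, $p = (0.1, 0.2, 0.3, 0.4)$); it computes $\mathrm{Var}(Y_1 \mid Y_2 = k)$ exactly and compares it with $\gamma + \delta k$.

```python
from itertools import product
from math import factorial


def multinomial_pmf(x, p):
    """P{X_1 = x_1, ..., X_4 = x_4} for counts x and cell probabilities p."""
    coef = factorial(sum(x))
    prob = 1.0
    for xi, pi in zip(x, p):
        coef //= factorial(xi)
        prob *= pi ** xi
    return coef * prob


def conditional_var_y1(n, p, k):
    """Exact Var(Y1 | Y2 = k), with Y1 = X1 + X3 and Y2 = X2 + X3."""
    m0 = m1 = m2 = 0.0
    for x1, x2, x3 in product(range(n + 1), repeat=3):
        x4 = n - x1 - x2 - x3
        if x4 < 0 or x2 + x3 != k:
            continue
        w = multinomial_pmf((x1, x2, x3, x4), p)
        y1 = x1 + x3
        m0 += w
        m1 += w * y1
        m2 += w * y1 * y1
    mean = m1 / m0
    return m2 / m0 - mean * mean


n, p = 5, (0.1, 0.2, 0.3, 0.4)        # assumed example values
v14 = p[0] * p[3] / (p[0] + p[3]) ** 2
v23 = p[1] * p[2] / (p[1] + p[2]) ** 2
gamma, delta = n * v14, v23 - v14
gaps = [abs(conditional_var_y1(n, p, k) - (gamma + delta * k)) for k in range(n + 1)]
```

Here $\delta = 0.24 - 0.16 = 0.08 \neq 0$, so the conditional variance changes with $Y_2$; it would be constant only if $p_2 p_3 / (p_2 + p_3)^2 = p_1 p_4 / (p_1 + p_4)^2$.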
Table 1. Upper bound of $r$ for selected values of $\pi_1$ (rows) and $\pi_2$ (columns).

\pi_1 \ \pi_2   0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
0.1             1       0.667   0.509   0.409   0.333   0.272   0.218   0.167   0.111
0.2             0.667   1       0.767   0.612   0.500   0.408   0.327   0.250   0.167
0.3             0.509   0.767   1       0.802   0.655   0.534   0.428   0.327   0.218
0.4             0.409   0.612   0.802   1       0.816   0.667   0.534   0.408   0.272
0.5             0.333   0.500   0.655   0.816   1       0.816   0.655   0.500   0.333
0.6             0.272   0.409   0.534   0.667   0.816   1       0.802   0.612   0.409
0.7             0.218   0.327   0.428   0.534   0.655   0.802   1       0.767   0.509
0.8             0.167   0.250   0.327   0.409   0.500   0.612   0.767   1       0.667
0.9             0.111   0.167   0.218   0.272   0.333   0.409   0.509   0.667